
    Understanding Deep Convolutional Networks

    Deep convolutional networks provide state-of-the-art classification and regression results over many high-dimensional problems. We review their architecture, which scatters data with a cascade of linear filter weights and non-linearities. A mathematical framework is introduced to analyze their properties. Computations of invariants involve multiscale contractions, the linearization of hierarchical symmetries, and sparse separations. Applications are discussed.
    Comment: 17 pages, 4 figures

    Wavelet Scattering on the Pitch Spiral

    We present a new representation of harmonic sounds that linearizes the dynamics of pitch and spectral envelope, while remaining stable to deformations in the time-frequency plane. It is an instance of the scattering transform, a generic operator which cascades wavelet convolutions and modulus nonlinearities. It is derived from the pitch spiral, in that convolutions are successively performed in time, log-frequency, and octave index. We give a closed-form approximation of spiral scattering coefficients for a nonstationary generalization of the harmonic source-filter model.
    Comment: Proceedings of the 18th International Conference on Digital Audio Effects (DAFx-15), Trondheim, Norway, Nov 30 - Dec 3, 2015, pp. 429--432. 4 pages, 3 figures
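    The generic cascade of wavelet convolutions and modulus nonlinearities mentioned above can be sketched in a few lines of NumPy. This is a minimal, illustrative stand-in: `gaussian_bandpass` is a crude Fourier-domain band-pass rather than a proper analytic wavelet, the filter centers are arbitrary, and the spiral-specific convolutions along log-frequency and octave index are omitted.

```python
import numpy as np

def gaussian_bandpass(n, xi, sigma):
    # Crude band-pass filter in the Fourier domain, centered at frequency xi.
    # A stand-in for a proper analytic wavelet.
    freqs = np.fft.fftfreq(n)
    return np.exp(-((freqs - xi) ** 2) / (2 * sigma ** 2))

def scattering(x, centers=(0.25, 0.125, 0.0625), sigma=0.02):
    """Two layers of the generic cascade: |x * psi_1|, then ||x * psi_1| * psi_2|,
    each averaged in time to give first- and second-order coefficients."""
    n = len(x)
    X = np.fft.fft(x)
    s1, s2 = [], []
    for xi1 in centers:
        u1 = np.abs(np.fft.ifft(X * gaussian_bandpass(n, xi1, sigma)))
        s1.append(u1.mean())  # first-order coefficient: time average of the envelope
        U1 = np.fft.fft(u1)
        for xi2 in centers:
            if xi2 < xi1:  # second wavelet must sit below the first envelope's band
                u2 = np.abs(np.fft.ifft(U1 * gaussian_bandpass(n, xi2, sigma)))
                s2.append(u2.mean())
    return np.array(s1), np.array(s2)
```

    A pure tone at the center frequency of the first filter concentrates its first-order energy there, which is the sense in which the representation separates pitch content across scales.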

    Audio Texture Synthesis with Scattering Moments

    We introduce an audio texture synthesis algorithm based on scattering moments. A scattering transform is computed by iteratively decomposing a signal with complex wavelet filter banks and computing their amplitude envelopes. Scattering moments provide general representations of stationary processes, computed as expected values of scattering coefficients. They are estimated with low-variance estimators from single realizations. Audio signals having prescribed scattering moments are synthesized with a gradient descent algorithm. Audio synthesis examples show that scattering representations provide good synthesis of audio textures with far fewer coefficients than the state of the art.
    Comment: 5 pages, 2 figures
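    The synthesis loop can be sketched as follows. This toy version matches only first-order moments, uses rectangular Fourier bands in place of a real wavelet filter bank, and computes the gradient by finite differences purely for readability; the paper's algorithm uses analytic gradients of the full scattering moments.

```python
import numpy as np

def band_filters(n, bands=((2, 6), (6, 14))):
    """Rectangular Fourier-domain bands standing in for a wavelet filter bank."""
    filters = []
    for lo, hi in bands:
        h = np.zeros(n)
        h[lo:hi] = 1.0
        filters.append(h)
    return filters

def moments(x, filters):
    """First-order scattering-like moments: time averages of |x * psi|."""
    X = np.fft.fft(x)
    return np.array([np.abs(np.fft.ifft(X * h)).mean() for h in filters])

def synthesize(target, filters, n, steps=200, lr=0.5, eps=1e-4, seed=0):
    """Gradient descent from white noise until the moments match `target`."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    for _ in range(steps):
        loss0 = ((moments(x, filters) - target) ** 2).sum()
        grad = np.zeros(n)
        for i in range(n):  # forward finite differences, one coordinate at a time
            x[i] += eps
            grad[i] = (((moments(x, filters) - target) ** 2).sum() - loss0) / eps
            x[i] -= eps
        x -= lr * grad
    return x
```

    The initialization from white noise is what makes each run a new realization of the same texture: all synthesized signals share the prescribed moments but differ otherwise.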

    Generative networks as inverse problems with Scattering transforms

    Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) provide impressive image generation from Gaussian white noise, but the underlying mathematics is not well understood. We compute deep convolutional network generators by inverting a fixed embedding operator; they therefore do not need to be optimized with a discriminator or an encoder. The embedding is Lipschitz-continuous to deformations, so that generators transform linear interpolations between input white noise vectors into deformations between output images. This embedding is computed with a wavelet Scattering transform. Numerical experiments demonstrate that the resulting Scattering generators have properties similar to GANs and VAEs, without learning a discriminative network or an encoder.
    Comment: International Conference on Learning Representations, 201

    Rigid-Motion Scattering for Texture Classification

    A rigid-motion scattering computes adaptive invariants along translations and rotations, with a deep convolutional network. Convolutions are calculated on the rigid-motion group, with wavelets defined on the translation and rotation variables. It preserves joint rotation and translation information, while providing global invariants at any desired scale. Texture classification is studied, through the characterization of stationary processes from a single realization. State-of-the-art results are obtained on multiple texture databases with important rotation and scaling variabilities.
    Comment: 19 pages, submitted to International Journal of Computer Vision
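    The lifting onto the rigid-motion group can be illustrated on the discrete group of 90-degree rotations (C4), a toy stand-in for the continuous rotation variable of the paper. The image is circularly cross-correlated with the filter in each of its four orientations, and averaging over both the group and space yields a rotation-invariant descriptor.

```python
import numpy as np

def c4_lift(img, filt):
    """Lift an image onto translations x C4: circular cross-correlation with
    the filter rotated by 0, 90, 180 and 270 degrees, followed by a modulus.
    The filter is assumed to have the same (square) shape as the image."""
    F = np.fft.fft2(img)
    layers = []
    for k in range(4):
        H = np.fft.fft2(np.rot90(filt, k))
        layers.append(np.abs(np.fft.ifft2(F * np.conj(H))))
    return np.stack(layers)  # shape (4, height, width)

def rotation_invariant(img, filt):
    """Global invariant: average the lifted coefficients over the group and space."""
    return c4_lift(img, filt).mean()
```

    Because the filter set is closed under 90-degree rotation and the correlation is circular, rotating the input merely permutes (and spatially rotates) the four layers, so the averaged descriptor is exactly invariant.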

    Phase retrieval for the Cauchy wavelet transform

    We consider the phase retrieval problem, in which one tries to reconstruct a function from the modulus of its wavelet transform. We study the uniqueness and stability of the reconstruction. In the case where the wavelets are Cauchy wavelets, we prove that the modulus of the wavelet transform uniquely determines the function up to a global phase. We show that the reconstruction operator is continuous but not uniformly continuous. We describe how to construct pairs of functions which are far apart in L^2 norm but whose wavelet transforms are very close in modulus. The principle is to modulate the wavelet transform of a fixed initial function by a phase which varies slowly in both time and frequency. This construction seems to cover all the instabilities that we observe in practice; we give a partial formal justification of this fact. Finally, we describe an exact reconstruction algorithm and use it to numerically confirm our analysis of the stability question.
    Comment: Acknowledgments updated
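    A generic baseline for this kind of problem is alternating projections in the Gerchberg-Saxton style, sketched below for a toy filter bank built from rectangular Fourier bands. This is not the exact reconstruction algorithm of the paper, and the filters are not Cauchy wavelets; it only shows the shape of the iteration: impose the measured modulus, then project back onto the range of the filter bank.

```python
import numpy as np

def filter_bank(n):
    """Rectangular partition of the frequency axis (unit total energy at
    every frequency), so the bank is exactly invertible by least squares."""
    f = np.abs(np.fft.fftfreq(n))
    return [(f < 0.1).astype(float),
            ((f >= 0.1) & (f < 0.3)).astype(float),
            (f >= 0.3).astype(float)]

def retrieve(modulus, filters, iters=300, seed=0):
    """Alternating projections: keep the current phase, impose the measured
    modulus, then invert the filter bank by least squares."""
    n = modulus.shape[1]
    x = np.random.default_rng(seed).standard_normal(n)
    energy = sum(np.abs(h) ** 2 for h in filters)
    for _ in range(iters):
        X = np.fft.fft(x)
        coeffs = [np.fft.ifft(X * h) for h in filters]
        coeffs = [m * np.exp(1j * np.angle(c)) for m, c in zip(modulus, coeffs)]
        X = sum(np.fft.fft(c) * np.conj(h) for c, h in zip(coeffs, filters)) / energy
        x = np.real(np.fft.ifft(X))
    return x
```

    The instabilities analyzed in the abstract show up here too: two signals with nearly equal modulus measurements can pull the iteration toward very different reconstructions.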

    Classification with Scattering Operators

    A scattering vector is a local descriptor including multiscale and multi-direction co-occurrence information. It is computed with a cascade of wavelet decompositions and complex modulus. This scattering representation is locally translation-invariant and linearizes deformations. A supervised classification algorithm is computed with a PCA model selection on scattering vectors. State-of-the-art results are obtained for handwritten digit recognition and texture classification.
    Comment: 6 pages. CVPR 201
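    One common instantiation of PCA model selection, consistent with the description above, is to fit a per-class affine space (class mean plus top principal directions) and assign a vector to the class whose space approximates it best. A minimal sketch, with hypothetical names:

```python
import numpy as np

def fit_class_models(X_by_class, d=2):
    """Per-class affine PCA model: the class mean plus the top-d
    principal directions of the centered class data."""
    models = []
    for X in X_by_class:
        mu = X.mean(axis=0)
        _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
        models.append((mu, Vt[:d]))
    return models

def classify(x, models):
    """Pick the class whose affine space approximates x with smallest error."""
    errs = []
    for mu, V in models:
        r = x - mu
        r = r - V.T @ (V @ r)  # residual orthogonal to the class affine space
        errs.append(np.dot(r, r))
    return int(np.argmin(errs))
```

    Model selection then amounts to choosing the dimension d (per class) that minimizes a validation error, trading approximation power against overfitting.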

    Deep Learning by Scattering

    We introduce general scattering transforms as mathematical models of deep neural networks with l2 pooling. Scattering networks iteratively apply complex-valued unitary operators, and the pooling is performed by a complex modulus. An expected scattering defines a contractive representation of a high-dimensional probability distribution, which preserves its mean-square norm. We show that unsupervised learning can be cast as an optimization of the space contraction to preserve the volume occupied by unlabeled examples at each layer of the network. Supervised learning and classification are performed with an averaged scattering, which provides scattering estimations for multiple classes.
    Comment: 10 pages, 1 figure
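    The contraction and norm-preservation properties can be checked numerically on a toy layer. Here the unitary operator is the orthonormal FFT (any unitary matrix would do), and l2 pooling is the pointwise complex modulus; since the modulus is pointwise non-expansive and the operator is unitary, the cascade cannot increase distances while each layer preserves the l2 norm.

```python
import numpy as np

def layer(x):
    """One scattering-style layer: a unitary operator (the orthonormal FFT)
    followed by l2 pooling with the complex modulus."""
    return np.abs(np.fft.fft(x, norm="ortho"))

def scatter(x, depth=3):
    """Iterate the layer, as the networks described above do."""
    for _ in range(depth):
        x = layer(x)
    return x
```

    Norm preservation follows because ||Ux|| = ||x|| for unitary U and |.| keeps the vector's l2 norm; contraction follows from ||a| - |b|| <= |a - b| applied coordinate-wise.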

    Geometric Models with Co-occurrence Groups

    A geometric model of sparse signal representations is introduced for classes of signals. It is computed by optimizing co-occurrence groups with a maximum likelihood estimate calculated with a Bernoulli mixture model. Applications to face image compression and MNIST digit classification illustrate the applicability of this model.
    Comment: 6 pages, ESANN 201
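    The maximum-likelihood estimate for a Bernoulli mixture is typically computed with EM. The sketch below fits a plain mixture of Bernoulli distributions over binary vectors; the co-occurrence group optimization of the paper is omitted, and the function name is illustrative.

```python
import numpy as np

def bernoulli_mixture_em(X, K=2, iters=50, seed=0):
    """EM for a mixture of K Bernoulli components over binary data X (n x d)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                       # mixture weights
    theta = rng.uniform(0.25, 0.75, size=(K, d))   # per-component success probabilities
    for _ in range(iters):
        # E-step: responsibilities, computed in the log domain for stability
        logp = (X @ np.log(theta).T
                + (1 - X) @ np.log(1 - theta).T
                + np.log(pi))
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: reweighted maximum-likelihood updates
        nk = r.sum(axis=0)
        pi = nk / n
        theta = np.clip((r.T @ X) / nk[:, None], 1e-3, 1 - 1e-3)
    return pi, theta
```

    The clipping of theta away from 0 and 1 is a standard guard against degenerate components, which would otherwise assign zero likelihood to observed bits.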

    Deep Scattering Spectrum

    A scattering transform defines a locally translation-invariant representation which is stable to time-warping deformations. It extends MFCC representations by computing modulation spectrum coefficients of multiple orders, through cascades of wavelet convolutions and modulus operators. Second-order scattering coefficients characterize transient phenomena such as attacks and amplitude modulation. A frequency-transposition-invariant representation is obtained by applying a scattering transform along log-frequency. State-of-the-art classification results are obtained for musical genre and phone classification on the GTZAN and TIMIT databases, respectively.
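    The transposition-invariance step can be illustrated on a toy scalogram. A circular shift along the log-frequency axis is a crude model of pitch transposition; convolving across that axis, taking a modulus, and averaging over frequency bins yields coefficients that are exactly invariant to such shifts. Filter shapes and frequency grids below are arbitrary stand-ins.

```python
import numpy as np

def scalogram(x, centers=(0.4, 0.3, 0.2, 0.15, 0.1, 0.07, 0.05, 0.03), sigma=0.02):
    """First-order modulus coefficients |x * psi| on a crude log-frequency grid."""
    n = len(x)
    X = np.fft.fft(x)
    f = np.fft.fftfreq(n)
    return np.array([np.abs(np.fft.ifft(X * np.exp(-(f - xi) ** 2 / (2 * sigma ** 2))))
                     for xi in centers])

def transposition_invariant(S):
    """Scatter along log-frequency: circular convolution across the frequency
    axis, modulus, then an average over frequency bins. The final average
    makes the output invariant to circular shifts along log-frequency."""
    k = S.shape[0]
    w = np.exp(2j * np.pi * np.arange(k) / k)  # one oscillation across the axis
    out = np.abs(np.fft.ifft(np.fft.fft(S, axis=0) * np.fft.fft(w)[:, None], axis=0))
    return out.mean(axis=0)
```

    The modulus before the average is what preserves discriminative information: a plain average across frequency would also be shift-invariant but would discard the modulation structure along log-frequency.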